Music generation and human voice conversion based on LSTM

Authors

Abstract

Music is closely related to human life and is an important way for people to express their feelings. Deep neural networks have played a significant role in the field of music processing. There are many different network models that implement deep learning for audio. General networks suffer from problems such as complex operation and slow computing speed. In this paper, we introduce Long Short-Term Memory (LSTM), which is a recurrent network, to realize end-to-end training. The structure is simple and can generate better sequences after training the model. After music generation, human voice conversion is important for music understanding and for inserting lyrics into pure music. We propose an audio segmentation technology for segmenting the human voice into fixed lengths. Different notes are classified through piano music without considering the scale, and are correlated with the human voices we get. Finally, through the transformation, the generated piano music is expressed through the output of the human voice. Experimental results demonstrate that the proposed scheme can successfully obtain a human voice from pure piano music generated by LSTM.
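The abstract does not say how the fixed-length segmentation is carried out. As a rough illustration only, the Python sketch below splits a one-dimensional sample sequence into consecutive, non-overlapping segments of equal length. The function name `segment_fixed_length`, its parameters, the zero-padding of the final segment, and the toy input are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence


def segment_fixed_length(
    samples: Sequence[float],
    segment_len: int,
    pad_value: float = 0.0,
    keep_remainder: bool = True,
) -> List[List[float]]:
    """Split samples into consecutive, non-overlapping fixed-length segments.

    Hypothetical helper: the last, shorter chunk is either padded with
    pad_value (keep_remainder=True) or dropped (keep_remainder=False).
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")

    segments: List[List[float]] = []
    for start in range(0, len(samples), segment_len):
        chunk = list(samples[start:start + segment_len])
        if len(chunk) < segment_len:
            if not keep_remainder:
                break  # drop the incomplete tail
            # pad the tail so every segment has the same length
            chunk.extend([pad_value] * (segment_len - len(chunk)))
        segments.append(chunk)
    return segments


# Toy input (assumed): 10 samples of a voice recording, 4 samples per segment.
voice = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
segments = segment_fixed_length(voice, segment_len=4)
```

With real audio, `segment_len` would normally be derived from the sampling rate and the desired segment duration. Whether the incomplete tail should be padded or dropped depends on how the segments are later matched to notes, which the abstract leaves open.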

Similar resources

On Using Backpropagation for Speech Texture Generation and Voice Conversion

Inspired by recent work on neural network image generation which rely on backpropagation towards the network inputs, we present a proof-of-concept system for speech texture synthesis and voice conversion based on two mechanisms: approximate inversion of the representation learned by a speech recognition neural network, and on matching statistics of neuron activations between different source an...

Statistical Voice Conversion with WaveNet-Based Waveform Generation

This paper presents a statistical voice conversion (VC) technique with the WaveNet-based waveform generation. VC based on a Gaussian mixture model (GMM) makes it possible to convert the speaker identity of a source speaker into that of a target speaker. However, in the conventional vocoding process, various factors such as F0 extraction errors, parameterization errors and over-smoothing effects...

Denoising Recurrent Neural Network for Deep Bidirectional LSTM Based Voice Conversion

The paper studies the post processing in deep bidirectional Long Short-Term Memory (DBLSTM) based voice conversion, where the statistical parameters are optimized to generate speech that exhibits similar properties to target speech. However, there always exists residual error between converted speech and target one. We reformulate the residual error problem as speech restoration, which aims to ...

Subband based voice conversion

A new voice conversion method that improves the quality of the voice conversion output at higher sampling rates is proposed. Speaker Transformation Algorithm Using Segmental Codebooks (STASC) is modified to process source and target speech spectra in different subbands. The new method ensures better conversion at sampling rates above 16 kHz. Discrete Wavelet Transform (DWT) is employed for subba...

Voice conversion based on parameter transformation

This paper describes a voice conversion system based on parameter transformation [1]. Voice conversion is the process of making one person’s voice “source” sound like another person’s voice “target”[2]. We will present a voice conversion scheme consisting of three stages. First an analysis is performed on the natural speech to obtain the acoustical parameters. These parameters will be voiced an...

Journal

Journal title: MATEC Web of Conferences

Year: 2021

ISSN: 2261-236X, 2274-7214

DOI: https://doi.org/10.1051/matecconf/202133606015